PatchSizePilot

Overview

Recap

Meta-ecosystems have been studied looking at meta-ecosystems in which patch size was the same. However, of course, we know that meta-ecosystems are mad out of patches that have different size. To see the effects of patch size on meta-ecosystem properties, we ran a four weeks protist experiment in which different ecosystems were connected through the flow of nutrients. The flow of nutrients resulted from a perturbation of the ecosystems in which a fixed part of the cultures was boiled and then poored into the receiving patch. This had a fixed volume (e.g., small perturbation = 6.75 ml) and was the same across all patch sizes. The experiment design consisted in crossing two disturbances with a small, medium, and large isolated ecosystems and with a small-small, medium-medium, large-large, and small-large meta-ecosystem. We took videos every four days and we create this perturbation and resource flow the day after taking videos. We skipped the perturbation the day after we assembled the experiment so that we would start perturbing it when population densities were already high.

We had mainly two research questions:

  • Do local properties of a patch depend upon the size of the patch it is connected to?

  • Do regional properties of a meta-ecosystem depend upon the relative size of its patch?

Design

Lab work

Creation of high-density monocultures

23/3/22 PPM for increasing the number of monocultures in the collection.

24/3/22 Collection control. See monoculture maintenance lab book p. 47.

26/3/22 Increase of number of monocultures in the collection. To do so, take the best culture and make 3 new ones. See monoculture maintenance lab book p. 47.

1/4/22 Make PPM for high density monocultures. See PatchSizePilot lab book p. 5.

3/4/22 Make bacterial solution for high density monocultures. See PatchSizePilot lab book p. 8.

5/4/22 Grow high density monocultures. Make 3 high density monocultures for each protist species with 200 ml with 5% bacterial solution, 85% PPM, 10% protists, and 2 seeds. See PatchSizePilot lab book p. 10

10/4/2022 Check high density monocultures. Cep, Eup, Spi, Spi te were really low.

13/4/2022 Start of the experiment. See PatchSizePilot p. 33.

Things I could have done better

- Autoclave all the material in advance

- Get more high-density monocultures

- Decide in advance the days in which you are going to check the high-density monocultures and prepare bacteria in advance for that day so that if some of them crashed you are still on time to make new ones.

- Use a single lab book for also when you create PPM and check the collection.

- Make a really high amount of PPM, as you will need for so many different things (>10 L). Maybe also autoclave 1 L Schott bottles so that you don’t have to oxygenate whole 5 L bottles of PPM. I think that I should have maybe made even a 10 L bottle of PPM.

- According to Silvana protists take 4-7 days to grow. The fastest is Tet (ca 4 days) and the slowest is Spi (ca 7 days). Once that you grow them they should stay at carrying capacity for a bit of time I guess, as you can see in the monoculture collection. I should make sure I’m growing them in the right way. I think that maybe I should grow them 10 days in advance so that I could actually grow also the slow species if they crashed. What should I do if all of them crashed?

Biomass

Aim

Here I study how biomass density changes across treatments in the PatchSizePilot. In particular, my research questions are:

  • How does biomass density change regionally?

    • Do meta-ecosystems with the same total size but with patches that are either the same size or of different size have a different biomass density? (Do the medium-medium and small-large meta-ecosystems have different biomass density?)

    • And if they do, is it because of resource flow? Or would we see this also with small and large ecosystems that are not connected? (Do the small-large meta-ecosystems have different biomass density from two isolated small and large patches?)

  • How does biomass density change locally?
    • How does biomass density change according to the size to which the patch is connected? (Does a small patch connected to a small patch have the same biomass density than a small patch connected to a large patch? And does a large patch connected to a large patch have the same biomass density than a large patch connected to a small patch?)

Data

Experimental cultures (culture_info)

This table contains information about the 110 cultures of the experiment.

culture_info = read.csv(here("data", "PatchSizePilot_culture_info.csv"), header = TRUE)
datatable(culture_info[,1:10],
          rownames = FALSE,
          options = list(scrollX = TRUE),
          filter = list(position = 'top', 
                        clear = FALSE))

Local biomass dataset (ds_biomass)

This dataset is the master dataset containing all the information about the biomass of the experiment.

load(here("data", "population", "t0.RData")); t0 = pop_output
load(here("data", "population", "t1.RData")); t1 = pop_output
load(here("data", "population", "t2.RData")); t2 = pop_output
load(here("data", "population", "t3.RData")); t3 = pop_output
load(here("data", "population", "t4.RData")); t4 = pop_output
load(here("data", "population", "t5.RData")); t5 = pop_output
load(here("data", "population", "t6.RData")); t6 = pop_output
load(here("data", "population", "t7.RData")); t7 = pop_output
rm(pop_output)
#Column: time
t0$time = NA
t1$time = NA

#Column: replicate_video
t0$replicate_video = 1:12 #In t1 I took 12 videos of a single 
t1$replicate_video = 1 #In t1 I took only 1 video/culture
t2$replicate_video = 1 #In t2 I took only 1 video/culture
t3$replicate_video = 1 #In t3 I took only 1 video/culture
t4$replicate_video = 1 #In t4 I took only 1 video/culture
t5$replicate_video = 1 #In t5 I took only 1 video/culture
t6 = t6 %>%
  rename(replicate_video = replicate)
t7 = t7 %>%
  rename(replicate_video = replicate)
#Elongate t0 (so that it can be merged wiht culture_info)
number_of_columns_t0 = ncol(t0)
nr_of_cultures = nrow(culture_info)
nr_of_videos = nrow(t0)

t0 = t0[rep(row.names(t0), nr_of_cultures), 1:number_of_columns_t0] %>%
  arrange(file) %>%
  mutate(culture_ID = rep(1:nr_of_cultures, times = nr_of_videos))

#Merge time points
t0 = merge(culture_info,t0, by="culture_ID")
t1 = merge(culture_info,t1, by = "culture_ID")
t2 = merge(culture_info,t2, by = "culture_ID")
t3 = merge(culture_info,t3, by = "culture_ID")
t4 = merge(culture_info,t4, by = "culture_ID")
t5 = merge(culture_info,t5, by = "culture_ID")
t6 = merge(culture_info,t6, by = "culture_ID")
t7 = merge(culture_info,t7, by = "culture_ID")
ds_biomass = rbind(t0, t1, t2, t3, t4, t5, t6, t7)
rm(t0, t1, t2, t3, t4, t5, t6, t7)
#Take off spilled cultures
ds_biomass = ds_biomass %>%
  filter(! culture_ID %in% ecosystems_to_take_off)

#Column: time_point
ds_biomass$time_point[ds_biomass$time_point=="t0"] = 0
ds_biomass$time_point[ds_biomass$time_point=="t1"] = 1
ds_biomass$time_point[ds_biomass$time_point=="t2"] = 2
ds_biomass$time_point[ds_biomass$time_point=="t3"] = 3
ds_biomass$time_point[ds_biomass$time_point=="t4"] = 4
ds_biomass$time_point[ds_biomass$time_point=="t5"] = 5
ds_biomass$time_point[ds_biomass$time_point=="t6"] = 6
ds_biomass$time_point[ds_biomass$time_point=="t7"] = 7
ds_biomass$time_point = as.character(ds_biomass$time_point)

#Column: day
ds_biomass$day = NA
ds_biomass$day[ds_biomass$time_point== 0] = 0
ds_biomass$day[ds_biomass$time_point== 1] = 4
ds_biomass$day[ds_biomass$time_point== 2] = 8
ds_biomass$day[ds_biomass$time_point== 3] = 12
ds_biomass$day[ds_biomass$time_point== 4] = 16
ds_biomass$day[ds_biomass$time_point== 5] = 20
ds_biomass$day[ds_biomass$time_point== 6] = 24
ds_biomass$day[ds_biomass$time_point== 7] = 28

#Column: size_of_connected_patch
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "S"] = "S"
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "S (S_S)"] = "S"
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "S (S_L)"] = "L"
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "M (M_M)"] = "M"
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "L"] = "L"
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "L (L_L)"] = "L"
ds_biomass$size_of_connected_patch[ds_biomass$eco_metaeco_type == "L (S_L)"] = "S"

#Keep this dataset for the evaporation effects 
ds_for_evaporation = ds_biomass

ds_biomass = ds_biomass %>% 
  select(culture_ID, 
         patch_size,
         patch_size_volume,
         disturbance, 
         metaecosystem_type, 
         bioarea_per_volume, 
         replicate_video, 
         time_point,
         day,
         metaecosystem, 
         system_nr, 
         eco_metaeco_type,
         size_of_connected_patch) %>%
  relocate(culture_ID,
           system_nr,
           disturbance,
           time_point,
           day,
           patch_size,
           patch_size_volume,
           metaecosystem,
           metaecosystem_type,
           eco_metaeco_type,
           size_of_connected_patch,
           replicate_video,
           bioarea_per_volume)
datatable(ds_biomass,
          rownames = FALSE,
          options = list(scrollX = TRUE),
          filter = list(position = 'top', 
                        clear = FALSE))

Regional biomass data set (ds_regional_biomass)

This is the dataset of the regional biomass of different meta-ecosystems. It contains also the regional biomass of the combination of a small isolated and a large isolated patch (S_L_from_isolated).

ds_regional_biomass = ds_biomass %>%
  filter(metaecosystem == "yes") %>%
  filter(! system_nr %in% metaecosystems_to_take_off) %>%
  group_by(culture_ID, 
           system_nr, 
           disturbance, 
           time_point,
           day, 
           patch_size,
           patch_size_volume,
           metaecosystem_type) %>%
  summarise(bioarea_per_volume_video_averaged = mean(bioarea_per_volume)) %>%
  mutate(total_patch_bioarea = bioarea_per_volume_video_averaged * patch_size_volume) %>%
  group_by(system_nr, 
           disturbance, 
           time_point,
           day,
           metaecosystem_type) %>%
  summarise(total_regional_bioarea = sum(total_patch_bioarea))
isolated_S_and_L = ds_biomass %>%
  filter(eco_metaeco_type == "S" | eco_metaeco_type == "L") %>%
  group_by(system_nr, disturbance, time_point, day, eco_metaeco_type) %>%
  summarise(bioarea_per_volume_across_videos = mean(bioarea_per_volume))

isolated_S_low = isolated_S_and_L %>%
  filter(eco_metaeco_type == "S") %>%
  filter(disturbance == "low")
isolated_L_low = isolated_S_and_L %>%
  filter(eco_metaeco_type == "L") %>%
  filter(disturbance == "low")
isolated_S_high = isolated_S_and_L %>%
  filter(eco_metaeco_type == "S") %>%
  filter(disturbance == "high")
isolated_L_high = isolated_S_and_L %>%
  filter(eco_metaeco_type == "L") %>%
  filter(disturbance == "high")

S_low_system_nrs = unique(isolated_S_low$system_nr)
S_high_system_nrs = unique(isolated_S_high$system_nr)
L_low_system_nrs = unique(isolated_L_low$system_nr)
L_high_system_nrs = unique(isolated_L_high$system_nr)

low_system_nrs_combination = expand.grid(S_low_system_nrs, L_low_system_nrs) %>%
  mutate(disturbance = "low")
high_system_nrs_combination = expand.grid(S_high_system_nrs, L_high_system_nrs) %>%
  mutate(disturbance = "high")
system_nr_combinations = rbind(low_system_nrs_combination, high_system_nrs_combination) %>%
  rename(S_system_nr = Var1) %>%
  rename(L_system_nr = Var2)

number_of_combinations = nrow(system_nr_combinations)
SL_from_isolated_all_combinations = NULL
for (pair in 1:number_of_combinations){
  
  SL_from_isolated_one_combination = 
    ds_biomass %>%
    filter(system_nr %in% system_nr_combinations[pair,]) %>%
    group_by(disturbance, day, time_point, system_nr) %>%
    summarise(regional_bioarea_across_videos = mean(bioarea_per_volume)) %>%
    group_by(disturbance, day, time_point) %>%
    summarise(total_regional_bioarea = sum(regional_bioarea_across_videos)) %>%
    mutate(system_nr = 1000 + pair) %>%
    mutate(metaecosystem_type = "S_L_from_isolated")
  
  SL_from_isolated_all_combinations[[pair]] = SL_from_isolated_one_combination
  
}

SL_from_isolated_all_combinations_together = NULL
for (combination in 1:number_of_combinations){
 
  SL_from_isolated_all_combinations_together = 
    rbind(SL_from_isolated_all_combinations_together,
          SL_from_isolated_all_combinations[[pair]])
  
}

ds_regional_biomass = rbind(ds_regional_biomass, SL_from_isolated_all_combinations_together)
datatable(ds_regional_biomass,
          rownames = FALSE,
          options = list(scrollX = TRUE),
          filter = list(position = 'top', 
                        clear = FALSE))

Local biomass lnRR data-set (ds_lnRR)

ds_biomass_averaged_treatments = data.frame(eco_metaeco_type = character(),
                                         disturbance = character(),
                                         time_point = integer(),
                                         mean_bioarea_per_volume = double())

eco_metaeco_types = unique(ds_biomass$eco_metaeco_type)

for (disturbance_input in c("low", "high")){
  for (eco_metaeco_input in eco_metaeco_types){
    for (time_point_input in 0:7){
      
      temporary = ds_biomass %>%
        filter(eco_metaeco_type == eco_metaeco_input) %>%
        filter(disturbance == disturbance_input) %>%
        filter(time_point == time_point_input) %>% 
        group_by(culture_ID, eco_metaeco_type, patch_size, disturbance, time_point, day) %>%
        summarise(bioarea_per_volume_across_videos = mean(bioarea_per_volume)) %>%
        group_by(eco_metaeco_type, patch_size, disturbance, time_point, day) %>%
        summarise(mean_bioarea_per_volume = mean(bioarea_per_volume_across_videos))
      
      ds_biomass_averaged_treatments = rbind(ds_biomass_averaged_treatments, temporary)
      
    }
  }
}

ds_biomass_averaged_treatments$isolated_control = NA
for (patch_size_input in c("S", "L")){
  for (disturbance_input in c("low", "high")){
    for (time_point_input in 0:7){
      
      averaged_value_isolated_control = ds_biomass_averaged_treatments %>%
        filter(eco_metaeco_type == patch_size_input) %>%
        filter(disturbance == disturbance_input) %>%
        filter(time_point == time_point_input) %>%
        ungroup() %>%
        select(mean_bioarea_per_volume)
      
      ds_biomass_averaged_treatments$isolated_control[
        ds_biomass_averaged_treatments$patch_size == patch_size_input & 
        ds_biomass_averaged_treatments$disturbance == disturbance_input &
        ds_biomass_averaged_treatments$time_point == time_point_input] = 
        averaged_value_isolated_control
    
    }
  }
}

ds_biomass_averaged_treatments = ds_biomass_averaged_treatments %>%
  filter(!patch_size == "M") %>%
  mutate(isolated_control = as.numeric(isolated_control)) %>%
  mutate(lnRR_biomass = ln(mean_bioarea_per_volume / isolated_control))
#Takes about 3h to run. 
ds_lnRR = data.frame()
iterations_n = 1000
upper_bound = iterations_n * 0.025
lower_bound = iterations_n * 0.975
rows_to_subsample = 5

for (eco_metaeco_type_input in c("S (S_S)", "S (S_L)", "L (L_L)", "L (S_L)")){
  for (disturbance_input in c("low", "high")){
    for (time_point_input in 0:7){
      
      mean_bioarea_isolated = ds_biomass_averaged_treatments %>%
                               filter(eco_metaeco_type == eco_metaeco_type_input) %>%
                               filter(time_point == time_point_input) %>%
                               filter(disturbance == disturbance_input) %>%
                               ungroup() %>%
                               select(isolated_control)
      mean_bioarea_isolated = unlist(mean_bioarea_isolated)
      
      mean_bioarea_all_iterations = NULL
      for (iteration in 1:iterations_n){
        
        mean_bioarea_iteration = ds_biomass %>%
          filter(eco_metaeco_type == eco_metaeco_type_input) %>%
          filter(time_point == time_point_input) %>%
          filter(disturbance == disturbance_input) %>%
          group_by(culture_ID, system_nr, eco_metaeco_type, disturbance,time_point,day) %>%
          summarise(mean_bioarea_per_volume_video_averaged = mean(bioarea_per_volume)) %>%
          group_by(system_nr) %>%
          slice_sample(n = 1) %>%
          ungroup() %>%
          slice_sample(n = rows_to_subsample, 
                       replace = TRUE) %>%
          summarise(mean_bioarea_per_volume = mean(mean_bioarea_per_volume_video_averaged))
        mean_bioarea_all_iterations[iteration] = as.numeric(unlist(mean_bioarea_iteration))
        
        }
      
      mean_bioarea_all_iterations = sort(mean_bioarea_all_iterations, decreasing = TRUE)
      
      mean_bioarea_lower_CI =  mean_bioarea_all_iterations[lower_bound]
      mean_bioarea_density = mean_bioarea_all_iterations[iterations_n/2]
      mean_bioarea_upper_CI =  mean_bioarea_all_iterations[upper_bound]
      
      lnRR_bioarea_lower_CI = ln(mean_bioarea_lower_CI/mean_bioarea_isolated)
      lnRR_bioarea_density = ln(mean_bioarea_density/mean_bioarea_isolated)
      lnRR_bioarea_upper_CI = ln(mean_bioarea_upper_CI/mean_bioarea_isolated)
      
      new_row = nrow(ds_lnRR) + 1 
      ds_lnRR[new_row,] = NA
      ds_lnRR$disturbance[new_row] = disturbance_input
      ds_lnRR$eco_metaeco_type[new_row] = eco_metaeco_type_input
      ds_lnRR$time_point[new_row] = time_point_input
      ds_lnRR$day[new_row] = time_point_input*4
      ds_lnRR$lnRR_lower[new_row] = lnRR_bioarea_lower_CI
      ds_lnRR$lnRR[new_row] = lnRR_bioarea_density
      ds_lnRR$lnRR_upper[new_row] = lnRR_bioarea_upper_CI
      
    }
  }
}

write.csv(ds_lnRR, 
          file = here("results", "biomass", "bootstrapped_lnRR_patches.csv"),
          sep = ",",
          col.names = TRUE)
ds_lnRR = read.csv(here("results", "biomass", "bootstrapped_lnRR_patches.csv"), header = TRUE)
ds_lnRR = ds_lnRR %>%
  rename(lnRR_bioarea_density = lnRR)
datatable(ds_lnRR,
          rownames = FALSE,
          options = list(scrollX = TRUE),
          filter = list(position = 'top', 
                        clear = FALSE))

Regional biomass

All meta-ecosystems

ds_regional_biomass %>%
  filter(disturbance == "low") %>%
  filter(!metaecosystem_type == "S_L_from_isolated") %>%
  ggplot(aes(x = day,
             y = total_regional_bioarea,
             group = interaction(day, metaecosystem_type),
             fill = metaecosystem_type)) +
  geom_boxplot() +
  labs(title = "Disturbance = low",
       x = "Day",
       y = "Total bioarea (µm²)",
       fill = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("large-large",
                                     "medium-medium",
                                     "small-large",
                                 "small-small")) +
  geom_vline(xintercept = first_perturbation_day + 0.5,
             linetype="dotdash",
             color = "grey",
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

ds_regional_biomass %>%
  filter(disturbance == "high") %>%
  filter(!metaecosystem_type == "S_L_from_isolated") %>%
  ggplot(aes(x = day,
             y = total_regional_bioarea,
             group = interaction(day, metaecosystem_type),
             fill = metaecosystem_type)) +
  geom_boxplot() +
  labs(title = "Disturbance = high",
       x = "Day",
       y = "Total bioarea (µm²)",
       fill = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("large-large",
                                     "medium-medium",
                                     "small-large",
                                 "small-small")) +
  geom_vline(xintercept = first_perturbation_day + 0.5,
             linetype="dotdash",
             color = "grey",
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

Medium-Medium vs Small-Large

Do meta-ecosystems with the same total size but with patches that are either the same size or of different size have a different biomass density? (Do the medium-medium and small-large meta-ecosystems have different biomass density?)

Plots
ds_regional_biomass %>%
    filter ( disturbance == "low") %>%
    filter (metaecosystem_type == "S_L" | 
              metaecosystem_type == "M_M") %>%
    ggplot (aes(x = day,
                y = total_regional_bioarea,
                group = system_nr,
                fill = system_nr,
                color = system_nr,
                linetype = metaecosystem_type)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (µm²)",
         title = "Disturbance = low",
         fill = "System nr",
         color = "System nr",
         linetype = "") +
    scale_x_continuous(limits = c(-2, 30)) +
    scale_linetype_discrete(labels = c("medium-medium",
                                     "small-large")) + 
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

ds_regional_biomass %>%
    filter ( disturbance == "high") %>%
    filter (metaecosystem_type == "S_L" | metaecosystem_type == "M_M") %>%
    ggplot (aes(x = day,
                y = total_regional_bioarea,
                group = system_nr,
                fill = system_nr,
                color = system_nr,
                linetype = metaecosystem_type)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (µm²)",
         title = "Disturbance = high",
         fill = "System nr",
         color = "System nr",
         linetype = "") +
    scale_x_continuous(limits = c(-2, 30)) +
    scale_linetype_discrete(labels = c("medium-medium",
                                     "small-large")) + 
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

regional_publication = ds_regional_biomass %>%
  filter(disturbance == "low") %>%
  filter (metaecosystem_type == "S_L" | metaecosystem_type == "M_M") %>%
  ggplot (aes(x = day,
              y = total_regional_bioarea,
              group = interaction(day, metaecosystem_type),
              fill = metaecosystem_type)) +
  geom_boxplot() +
  labs(x = "Day", 
       y = "Regional bioarea (µm²)",
       title = "Disturbance = low",
       color='', 
       fill='') +
  scale_fill_discrete(labels = c("medium-medium", 
                                 "small-large")) +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6))  +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")
regional_publication

ds_regional_biomass %>%
  filter(disturbance == "high") %>%
  filter (metaecosystem_type == "S_L" | metaecosystem_type == "M_M") %>%
  ggplot (aes(x = day,
              y = total_regional_bioarea,
              group = interaction (day, metaecosystem_type),
              fill = metaecosystem_type)) +
  geom_boxplot() +
  labs(x = "Day", 
       y = "Regional bioarea (µm²)",
       title = "Disturbance = high",
       color='', 
       fill='') +
  scale_fill_discrete(labels = c("medium-medium", 
                                 "small-large")) +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

Model time series

How does the biomass density of medium-medium and small-large meta-ecosystems differ across the time series? (The first two points before the first disturbance are taken off).

We will exclude in the model the time point 0 and 1. At time point 0 all cultures were the same because that’s how we made them. At time point 1 no disturbance event had already taken place.

Let’s see how linear is the time trend of bioarea and if we can make it more linear with a log10 transformation. We are lucky that during the modelling process we need to drop the first two time points, which would have made the biomass trend not linear.

Linearity of regional bioarea ~ time

ds_regional_biomass %>%
  filter(time_point >= 2) %>%
  filter(!metaecosystem_type == "S_L_from_isolated") %>%
  ggplot(aes(x = day,
             y = total_regional_bioarea,
             group = day)) +
  geom_boxplot() +
  labs(x = "Day",
       y = "Regional bioarea (µm²)")

linear_model = lm(total_regional_bioarea ~ 
                    day, 
                  data = ds_regional_biomass %>% 
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

par(mfrow=c(2,3))
plot(linear_model, which = 1:5)

Model selection

Let’ start from the full model.

\[ Total \: Regional \: Bioarea = t + M + D + tM + tD + MD + tDM + (t | system \: nr) \]

full = lmer(total_regional_bioarea ~
                     day * metaecosystem_type * disturbance +
                     (day | system_nr),
                     data = ds_regional_biomass %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))

Should we keep the correlation in (day | system_nr)?

no_correlation = lmer(total_regional_bioarea ~
                     day * metaecosystem_type * disturbance +
                     (day || system_nr),
                     data = ds_regional_biomass %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))

anova(full, no_correlation)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type ==  ...
## Models:
## no_correlation: total_regional_bioarea ~ day * metaecosystem_type * disturbance + ((1 | system_nr) + (0 + day | system_nr))
## full: total_regional_bioarea ~ day * metaecosystem_type * disturbance + (day | system_nr)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## no_correlation   11 2785.8 2816.5 -1381.9   2763.8                     
## full             12 2786.3 2819.8 -1381.2   2762.3 1.5333  1     0.2156

No.

Should we keep the random effect of system nr on the time slopes (day | system_nr)?

no_random_slopes = lmer(total_regional_bioarea ~
                     day * metaecosystem_type * disturbance +
                     (1 | system_nr),
                     data = ds_regional_biomass %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))

anova(no_correlation, no_random_slopes)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type ==  ...
## Models:
## no_random_slopes: total_regional_bioarea ~ day * metaecosystem_type * disturbance + (1 | system_nr)
## no_correlation: total_regional_bioarea ~ day * metaecosystem_type * disturbance + ((1 | system_nr) + (0 + day | system_nr))
##                  npar    AIC    BIC  logLik deviance Chisq Df Pr(>Chisq)
## no_random_slopes   10 2783.8 2811.7 -1381.9   2763.8                    
## no_correlation     11 2785.8 2816.5 -1381.9   2763.8     0  1          1

No.

Should we keep t * M * D?

no_threeway = lmer(total_regional_bioarea ~
                     day +
                     metaecosystem_type +
                     disturbance +
                     day : metaecosystem_type + 
                     day : disturbance +
                     metaecosystem_type : disturbance + 
                     (1 | system_nr),
                     data = ds_regional_biomass %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = 'optimx', 
                                         optCtrl = list(method = 'L-BFGS-B')))

anova(no_random_slopes, no_threeway)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type ==  ...
## Models:
## no_threeway: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + day:disturbance + metaecosystem_type:disturbance + (1 | system_nr)
## no_random_slopes: total_regional_bioarea ~ day * metaecosystem_type * disturbance + (1 | system_nr)
##                  npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## no_threeway         9 2781.9 2807.0 -1382.0   2763.9                     
## no_random_slopes   10 2783.8 2811.7 -1381.9   2763.8 0.0948  1     0.7582

No.

Should we keep t * M?

no_TM = lmer(total_regional_bioarea ~
                     day +
                     metaecosystem_type +
                     disturbance +
                     day : disturbance +
                     metaecosystem_type : disturbance + 
                     (1 | system_nr),
                     data = ds_regional_biomass %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))

anova(no_threeway,no_TM)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type ==  ...
## Models:
## no_TM: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:disturbance + metaecosystem_type:disturbance + (1 | system_nr)
## no_threeway: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + day:disturbance + metaecosystem_type:disturbance + (1 | system_nr)
##             npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)   
## no_TM          8 2787.3 2809.6 -1385.7   2771.3                        
## no_threeway    9 2781.9 2807.0 -1382.0   2763.9 7.3941  1   0.006544 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Yes.

Should we keep t * D?

no_TD = lmer(total_regional_bioarea ~
                     day +
                     metaecosystem_type +
                     disturbance +
                     day : metaecosystem_type + 
                     metaecosystem_type : disturbance + 
                     (1 | system_nr),
                     data = ds_regional_biomass %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))

anova(no_threeway, no_TD)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type ==  ...
## Models:
## no_TD: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + metaecosystem_type:disturbance + (1 | system_nr)
## no_threeway: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + day:disturbance + metaecosystem_type:disturbance + (1 | system_nr)
##             npar    AIC    BIC logLik deviance Chisq Df Pr(>Chisq)
## no_TD          8 2779.9 2802.2  -1382   2763.9                    
## no_threeway    9 2781.9 2807.0  -1382   2763.9 0.021  1     0.8847

No.

Should we keep M * D?

no_MD = lmer(total_regional_bioarea ~
                     day +
                     metaecosystem_type +
                     disturbance +
                     day : metaecosystem_type + 
                     (1 | system_nr),
                     data = ds_regional_biomass %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))

anova(no_TD, no_MD)
## Data: ds_regional_biomass %>% filter(time_point >= 2) %>% filter(metaecosystem_type ==  ...
## Models:
## no_MD: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + (1 | system_nr)
## no_TD: total_regional_bioarea ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + metaecosystem_type:disturbance + (1 | system_nr)
##       npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## no_MD    7 2778.3 2797.8 -1382.2   2764.3                     
## no_TD    8 2779.9 2802.2 -1382.0   2763.9 0.3541  1     0.5518

No.

Best model

Therefore, our best model is:

\[ Regional \: bioarea = t + M + D + tM + (1 | system \: nr) \]

best_model = no_MD

Let’s do some model diagnostics:

plot(best_model)

qqnorm(resid(best_model))

The R squared of this model for t2-t7 are:

R2_marginal = r.squaredGLMM(best_model)[1]
R2_marginal = round(R2_marginal, digits = 2)
R2_conditional = r.squaredGLMM(best_model)[2]
R2_conditional = round(R2_conditional, digits = 2)
  • Marginal R2 = 0.76

  • Conditional R2 = 0.78

Let’s just assume that this model holds also for t2-t5. Then, let’s recalculate the R squared.

t2_t5 = lmer(total_regional_bioarea ~
                     day +
                     metaecosystem_type +
                     disturbance +
                     day : metaecosystem_type + 
                     (1 | system_nr),
                     data = ds_regional_biomass %>%
                            filter(time_point >= 2) %>%
                            filter(time_point <= 5) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))

plot(t2_t5)

qqnorm(resid(t2_t5))

R2_marginal = r.squaredGLMM(t2_t5)[1]
R2_marginal = round(R2_marginal, digits = 2)
R2_conditional = r.squaredGLMM(t2_t5)[2]
R2_conditional = round(R2_conditional, digits = 2)

The R squared of this model for t2-t5 are:

  • Marginal R2 = 0.6

  • Conditional R2 = 0.61

### --- Work in progress: calculating R2 of M --- ###

R2_regional = partR2(best_model,
       partvars = c("day", 
                    "metaecosystem_type", 
                    "disturbance"),
       R2_type = "marginal", 
       nboot = 1000, 
       CI = 0.95)
saveRDS(R2_regional, file = here("results", "biomass", "R2_regional.RData"))
readRDS(here("results", "biomass", "R2_regional.RData"))
Model single points

How does the biomass density of medium-medium and small-large meta-ecosystems differ for each time point? (The first two points before the first disturbance are taken off).

Let’s now look at the full model and see if we should keep the interaction between meta-ecosystem type and disturbance. We are not using mixed effects because a certain system nr can’t be at different perturbations or at different meta-ecosystem types.

\[ Total \: Regional \: Bioarea = M + D + MD \]

Time point = 2

chosen_time_point = 2
full = lm(total_regional_bioarea ~
            metaecosystem_type +
            disturbance +
            metaecosystem_type * disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

Should we keep M * D?

no_MD = lm(total_regional_bioarea ~
            metaecosystem_type +
            disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

AIC(full,no_MD)
##       df      AIC
## full   5 482.5831
## no_MD  4 481.5121

No.

best_model = no_MD

par(mfrow=c(2,3))
plot(best_model, which = 1:5)

R2_full = glance(best_model)$r.squared

no_M = lm(total_regional_bioarea ~
            disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

R2_no_M = glance(no_M)$r.squared
R2_M = R2_full - R2_no_M

R2_full = round(R2_full, digits = 2)
R2_M = round(R2_M, digits = 2)

The adjusted R squared of the model is 0.17 and the adjusted R squared of patch type is 0.16.

Time point = 3

chosen_time_point = 3
full = lm(total_regional_bioarea ~
            metaecosystem_type +
            disturbance +
            metaecosystem_type * disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

Should we keep M * D?

no_MD = lm(total_regional_bioarea ~
            metaecosystem_type +
            disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

AIC(full,no_MD)
##       df      AIC
## full   5 467.0762
## no_MD  4 468.2493

Yes.

best_model = full

par(mfrow=c(2,3))
plot(best_model, which = 1:5)

R2_full = glance(best_model)$r.squared

no_M = lm(total_regional_bioarea ~
            disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

R2_no_M = glance(no_M)$r.squared
R2_M = R2_full - R2_no_M

R2_full = round(R2_full, digits = 2)
R2_M = round(R2_M, digits = 2)

The adjusted R squared of the model is 0.61 and the adjusted R squared of patch type is 0.36 (which includes also the interaction with disturbance).

Time point = 4

chosen_time_point = 4
full = lm(total_regional_bioarea ~
            metaecosystem_type +
            disturbance +
            metaecosystem_type * disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

Should we keep M * D?

no_MD = lm(total_regional_bioarea ~
            metaecosystem_type +
            disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

AIC(full,no_MD)
##       df      AIC
## full   5 472.5856
## no_MD  4 471.0777

No.

best_model = no_MD

par(mfrow=c(2,3))
plot(best_model, which = 1:5)

R2_full = glance(best_model)$r.squared

no_M = lm(total_regional_bioarea ~
            disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

R2_no_M = glance(no_M)$r.squared
R2_M = R2_full - R2_no_M

R2_full = round(R2_full, digits = 2)
R2_M = round(R2_M, digits = 2)

The adjusted R squared of the model is 0.21 and the adjusted R squared of patch type is 0.02.

Time point = 5

chosen_time_point = 5
full = lm(total_regional_bioarea ~
            metaecosystem_type +
            disturbance +
            metaecosystem_type * disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

Should we keep M * D?

no_MD = lm(total_regional_bioarea ~
            metaecosystem_type +
            disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

AIC(full,no_MD)
##       df      AIC
## full   5 466.1591
## no_MD  4 464.1787

No.

best_model = no_MD

par(mfrow=c(2,3))
plot(best_model, which = 1:5)

R2_full = glance(best_model)$r.squared

no_M = lm(total_regional_bioarea ~
            disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

R2_no_M = glance(no_M)$r.squared
R2_M = R2_full - R2_no_M

R2_full = round(R2_full, digits = 2)
R2_M = round(R2_M, digits = 2)

The adjusted R squared of the model is 0.31 and the adjusted R squared of patch type is 0.02.

Time point = 6

chosen_time_point = 6
full = lm(total_regional_bioarea ~
            metaecosystem_type +
            disturbance +
            metaecosystem_type * disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

Should we keep M * D?

no_MD = lm(total_regional_bioarea ~
            metaecosystem_type +
            disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

AIC(full,no_MD)
##       df      AIC
## full   5 439.5165
## no_MD  4 438.0414

No.

best_model = no_MD

par(mfrow=c(2,3))
plot(best_model, which = 1:5)

R2_full = glance(best_model)$r.squared

no_M = lm(total_regional_bioarea ~
            disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

R2_no_M = glance(no_M)$r.squared
R2_M = R2_full - R2_no_M

R2_full = round(R2_full, digits = 2)
R2_M = round(R2_M, digits = 2)

The adjusted R squared of the model is 0.45 and the adjusted R squared of patch type is 0.

Time point = 7

chosen_time_point = 7
full = lm(total_regional_bioarea ~
            metaecosystem_type +
            disturbance +
            metaecosystem_type * disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

Should we keep M * D?

no_MD = lm(total_regional_bioarea ~
            metaecosystem_type +
            disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

AIC(full,no_MD)
##       df      AIC
## full   5 427.4663
## no_MD  4 425.6015

No.

best_model = no_MD

par(mfrow=c(2,3))
plot(best_model, which = 1:5)

R2_full = glance(best_model)$r.squared

no_M = lm(total_regional_bioarea ~
            disturbance,
          data = ds_regional_biomass %>%
                            filter(time_point == chosen_time_point) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

R2_no_M = glance(no_M)$r.squared
R2_M = R2_full - R2_no_M

R2_full = round(R2_full, digits = 2)
R2_M = round(R2_M, digits = 2)

The adjusted R squared of the model is 0.59 and the adjusted R squared of patch type is 0.02.

Small-Large vs Small-Large from isolated

Do a meta-ecosystem with patches of the same size and a meta-ecosystems with patches of different size have different regional biomass density because of their resource flow? Or would we see this also with small and large ecosystems that are not connected? (Do the small-large meta-ecosystems have different biomass density from two isolated small and large patches?)

Plots
ds_regional_biomass %>%
    filter ( disturbance == "low") %>%
    filter (metaecosystem_type == "S_L" | metaecosystem_type == "S_L_from_isolated") %>%
    ggplot (aes(x = day,
                y = total_regional_bioarea,
                group = system_nr,
                fill = system_nr,
                color = system_nr,
                linetype = metaecosystem_type)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (µm²)",
         title = "Disturbance = low",
         fill = "System nr",
         color = "System nr",
         linetype = "") +
    scale_y_continuous(limits = c(0, 6250)) +
    scale_x_continuous(limits = c(-2, 30)) +
    scale_linetype_discrete(labels = c("small-large",
                                     "small-large \n from isolated")) + 
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")
## Warning: Removed 40 row(s) containing missing values (geom_path).

ds_regional_biomass %>%
    filter ( disturbance == "high") %>%
    filter (metaecosystem_type == "S_L" | metaecosystem_type == "S_L_from_isolated") %>%
    ggplot (aes(x = day,
                y = total_regional_bioarea,
                group = system_nr,
                fill = system_nr,
                color = system_nr,
                linetype = metaecosystem_type)) +
    geom_line () +
    labs(title = "Disturbance = high",
         x = "Day", 
         y = "Regional bioarea (µm²?)",
         fill = "System nr",
         color = "System nr",
         linetype = "") +
    scale_y_continuous(limits = c(0, 6250)) +
    scale_x_continuous(limits = c(-2, 30)) +
    scale_linetype_discrete(labels = c("small-large",
                                     "small-large \n from isolated")) + 
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")
## Warning: Removed 40 row(s) containing missing values (geom_path).

ds_regional_biomass %>%
  filter(disturbance == "low") %>%
  filter(metaecosystem_type == "S_L" | metaecosystem_type == "S_L_from_isolated") %>%
  ggplot(aes(x = day,
             y = total_regional_bioarea,
             group = interaction(day, metaecosystem_type),
             fill = metaecosystem_type)) +
  geom_boxplot() +
  labs(title = "Disturbance = low",
       x = "Day",
       y = "Regional bioarea (µm²)",
       fill = "") +
  scale_fill_discrete(labels = c("small-large", "isolated small & \n isolated large")) + 
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

ds_regional_biomass %>%
  filter(disturbance == "high") %>%
  filter(metaecosystem_type == "S_L" | metaecosystem_type == "S_L_from_isolated") %>%
  ggplot(aes(x = day,
             y = total_regional_bioarea,
             group = interaction(day, metaecosystem_type),
             fill = metaecosystem_type)) +
  geom_boxplot() +
  labs(title = "Disturbance = high",
       x = "Day",
       y = "Regional bioarea (µm²)",
       fill = "") +
  scale_fill_discrete(labels = c("small-large", "isolated small & \n isolated large")) + 
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

Modelling

Meta-ecosystems of different total size

How does the biomass density of meta-ecosystems change according to the size of their patches?

ds_regional_biomass %>%
  filter(!metaecosystem_type == "S_L") %>%
  filter ( disturbance == "low") %>%
  ggplot (aes(x = day,
                y = total_regional_bioarea,
                group = system_nr,
                fill = system_nr,
              color = system_nr,
                linetype = metaecosystem_type)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (µm²)",
         title = "Disturbance = low",
         fill = "System nr",
         linetype = "") +
    scale_y_continuous(limits = c(0, 6250)) +
    scale_x_continuous(limits = c(-2, 30)) +
  scale_colour_continuous(guide = "none") +
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("large-large",
                                     "medium-medium",
                                     "small-small"))  +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")
## Warning: Removed 110 row(s) containing missing values (geom_path).

ds_regional_biomass %>%
  filter(!metaecosystem_type == "S_L") %>%
  filter ( disturbance == "high") %>%
  ggplot (aes(x = day,
                y = total_regional_bioarea,
                group = system_nr,
                fill = system_nr,
              color = system_nr,
                linetype = metaecosystem_type)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (µm²)",
         title = "Disturbance = high",
         fill = "System nr",
         linetype = "") +
    scale_y_continuous(limits = c(0, 6250)) +
    scale_x_continuous(limits = c(-2, 30)) +
    scale_colour_continuous(guide = "none") +
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("large-large",
                                     "medium-medium",
                                     "small-small"))  +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")
## Warning: Removed 104 row(s) containing missing values (geom_path).

ds_regional_biomass %>%
  filter(disturbance == "low") %>%
  filter(!metaecosystem_type == "S_L") %>%
  ggplot(aes(x = day,
             y = total_regional_bioarea,
             group = interaction(day, metaecosystem_type),
             fill = metaecosystem_type)) +
  geom_boxplot() + 
  labs(title = "Disturbance = low",
       x = "Day",
       y = "Regional bioarea (µm²)",
       fill = "") + 
  #scale_fill_discrete(labels = c("isolated large", "isolated medium", "isolated small")) +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("large-large",
                                 "medium-medium",
                                 "small-small")) +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

ds_regional_biomass %>%
  filter(disturbance == "high") %>%
  filter(!metaecosystem_type == "S_L") %>%
  ggplot(aes(x = day,
             y = total_regional_bioarea,
             group = interaction(day, metaecosystem_type),
             fill = metaecosystem_type)) +
  geom_boxplot() + 
  labs(title = "Disturbance = high",
       x = "Day",
       y = "Regional bioarea (µm²)",
       fill = "") + 
  #scale_fill_discrete(labels = c("isolated large", "isolated medium", "isolated small")) +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("large-large",
                                 "medium-medium",
                                 "small-small")) +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

Interesting. It seems like there’s not much difference between the medium-medium and the large-large.

Local biomass

All patches

Bioarea density
for (disturbance_input in c("low", "high")) {
print(ds_biomass %>%
  group_by(culture_ID, disturbance, day, eco_metaeco_type) %>%
  summarise(bioarea_per_volume = mean(bioarea_per_volume)) %>% #Average across videos
  filter(disturbance == disturbance_input) %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = interaction(day, eco_metaeco_type),
             fill = eco_metaeco_type)) +
  geom_boxplot() +
  labs(title = paste("Disturbance =", disturbance_input),
       x = "Day",
       y = "Local bioarea (µm²/µl)",
       fill = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
  #      legend.position = c(.99, .999),
  #      legend.justification = c("right", "top"),
  #      legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("large isolated",
                                 "large connected to large",
                                 "large connected to small",
                                 "medium isolated",
                                 "medium connected to medium",
                                 "small isolated",
                                 "small connected to large",
                                 "small connected to small")) +
  geom_vline(xintercept = first_perturbation_day + 0.6,
             linetype="dotdash",
             color = "grey",
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation"))
}

Total bioarea
ds_biomass_total = ds_biomass %>%
  filter(!culture_ID %in% ecosystems_to_take_off) %>%
  group_by(culture_ID, 
           system_nr, 
           disturbance, 
           time_point,
           day, 
           patch_size,
           patch_size_volume,
           eco_metaeco_type) %>%
  summarise(bioarea_per_volume_video_averaged = mean(bioarea_per_volume)) %>%
  mutate(total_patch_bioarea = bioarea_per_volume_video_averaged * patch_size_volume)

for (disturbance_input in c("low", "high")) {
print(ds_biomass_total %>%
  filter(disturbance == disturbance_input) %>%
  ggplot(aes(x = day,
             y = total_patch_bioarea,
             group = interaction(day, eco_metaeco_type),
             fill = eco_metaeco_type)) +
  geom_boxplot() +
  labs(title = paste("Disturbance =", disturbance_input),
       x = "Day",
       y = "Total patch bioarea (µm²)",
       fill = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
  #      legend.position = c(.99, .999),
  #      legend.justification = c("right", "top"),
  #      legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("large isolated",
                                 "large connected to large",
                                 "large connected to small",
                                 "medium isolated",
                                 "medium connected to medium",
                                 "small isolated",
                                 "small connected to large",
                                 "small connected to small")) +
  geom_vline(xintercept = first_perturbation_day + 0.6,
             linetype="dotdash",
             color = "grey",
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation"))
}

Small patches

How does biomass density change according to the size to which the patch is connected? (Does a small patch connected to a small patch have the same biomass density than a small patch connected to a large patch?)

Plots
for (disturbance_input in c("low", "high")) {

  print(ds_biomass %>%
  filter(disturbance == disturbance_input) %>%
  filter(patch_size == "S") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = culture_ID,
             fill = culture_ID,
             color = culture_ID,
             linetype = eco_metaeco_type)) +
  geom_line(stat = "summary", fun = "mean") +
  labs(title = paste("Disturbance =", disturbance_input),
       x = "Day",
       y = "Local bioarea (µm²/μl)",
       linetype = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("small isolated",
                                     "small connected to small",
                                     "small connected to large"))  +
  geom_vline(xintercept = first_perturbation_day,
             linetype="dotdash",
             color = "grey",
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation"))
}

for (disturbance_input in c("low", "high")) {
  
print(ds_biomass %>%
  filter(disturbance == disturbance_input) %>%
  filter(patch_size == "S") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = interaction(day,eco_metaeco_type),
             fill = eco_metaeco_type)) +
  geom_boxplot() +
  labs(title = paste("Disturbance =", disturbance_input),
       x = "Day",
       y = "Local bioarea (µm²/μl)",
       fill = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("small isolated",
                                 "small connected to small",
                                 "small connected to large")) +
  geom_vline(xintercept = first_perturbation_day + 0.7,
             linetype="dotdash",
             color = "grey",
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation"))
}

for (disturbance_input in c("low", "high")) {
  
print(ds_lnRR %>%
  filter(disturbance == disturbance_input) %>%
  filter(eco_metaeco_type == "S (S_S)" | eco_metaeco_type == "S (S_L)") %>%
  ggplot(aes(x = day,
             y = lnRR_bioarea_density,
             color = eco_metaeco_type)) +
  geom_point(position = position_dodge(0.5)) +
  geom_line(position = position_dodge(0.5)) + 
  labs(title = paste("Disturbance =", disturbance_input),
       x = "Day",
       y = "lnRR local bioarea (µm²/µl)",
       color = "") +
  geom_errorbar(aes(ymin = lnRR_lower, 
                    ymax = lnRR_upper), 
                width = .2,
                position = position_dodge(0.5)) + 
  scale_color_discrete(labels = c("small connected to large", 
                                 "small connnected to small")) +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.40, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  #geom_vline(xintercept = first_perturbation_day + 0.7, 
  #           linetype="dotdash", 
  #           color = "grey", 
  #           size=0.7) +
  geom_hline(yintercept = 0, 
             linetype="dotted", 
             color = "black", 
             size=0.7))
}

Time series
Model selection

Let’s start from the full model (no mixed effect: meta-ecosystems have been pulled to create the lnRR):

\[ ln \: RR (bioarea \: density) = t + P + D + t*P + t*D + P*D \]

lnRR(bioarea density) = lnRR of the bioarea density (base level is calculated at each disturbance level and time point as the mean bioarea of the small isolated patches)

t = time

P = patch type

D = disturbance

We will exclude in the model the time point 0 and 1. At time point 0 all cultures were the same because that’s how we made them. At time point 1 no disturbance event had already taken place.

first_time_point = 2
last_time_point = 7
full_model = lm(lnRR_bioarea_density ~                  day + 
                  eco_metaeco_type + 
                  disturbance +
                  day * eco_metaeco_type +
                  day * disturbance + 
                  eco_metaeco_type * disturbance,
                  data = ds_lnRR %>%
                         filter(time_point >= first_time_point) %>%
                         filter(time_point <= last_time_point) %>%
                         filter(eco_metaeco_type== "S (S_S)" | 
                                eco_metaeco_type == "S (S_L)"))

Should we keep t * P?

no_TP = lm(lnRR_bioarea_density ~
                  day + 
                  eco_metaeco_type + 
                  disturbance +
                  day * disturbance + 
                  eco_metaeco_type * disturbance,
                  data = ds_lnRR %>%
                         filter(time_point >= first_time_point) %>%
                         filter(time_point <= last_time_point) %>%
                         filter(eco_metaeco_type== "S (S_S)" | 
                                eco_metaeco_type == "S (S_L)"))

AIC(full_model, no_TP)
##            df      AIC
## full_model  8 24.13998
## no_TP       7 28.20462

Yes.

Should we keep t * D?

no_TD = lm(lnRR_bioarea_density ~
                  day + 
                  eco_metaeco_type + 
                  disturbance +
                  day * eco_metaeco_type +
                  eco_metaeco_type * disturbance,
                  data = ds_lnRR %>%
                         filter(time_point >= first_time_point) %>%
                         filter(time_point <= last_time_point) %>%
                         filter(eco_metaeco_type== "S (S_S)" | 
                                eco_metaeco_type == "S (S_L)"))

AIC(full_model, no_TD)
##            df      AIC
## full_model  8 24.13998
## no_TD       7 23.04001

No.

Should we keep P * D?

no_PD = lm(lnRR_bioarea_density ~
                  day + 
                  eco_metaeco_type + 
                  disturbance +
                  day * eco_metaeco_type,
                  data = ds_lnRR %>%
                         filter(time_point >= first_time_point) %>%
                         filter(time_point <= last_time_point) %>%
                         filter(eco_metaeco_type== "S (S_S)" | 
                                eco_metaeco_type == "S (S_L)"))

AIC(no_TD, no_PD)
##       df      AIC
## no_TD  7 23.04001
## no_PD  6 22.04118

No.

Best model

Then our best model is:

\[ lnRR (bioarea \: density) = t + P + D + t*P \]

Let’s then do some model diagnostics.

best_model = no_PD
par(mfrow = c(2,3))
plot(best_model, which = 1:5)

R2_full = glance(best_model)$r.squared
no_patch_type = lm(lnRR_bioarea_density ~
                  day + 
                  disturbance,
                  data = ds_lnRR %>%
                         filter(time_point >= first_time_point) %>%
                         filter(time_point <= last_time_point) %>%
                         filter(eco_metaeco_type== "S (S_S)" | 
                                eco_metaeco_type == "S (S_L)"))

R2_no_P = glance(no_patch_type)$r.squared
R2_P = R2_full - R2_no_P

R2_full = round(R2_full, digits = 2)
R2_P = round(R2_P, digits = 2)


The adjusted R squared of the model is 0.74 and the adjusted R squared of patch type is 0.27 (which includes also its interaction with disturbance).

Single time points

Large patches

How does biomass density change according to the size to which the patch is connected? (Does a large patch connected to a large patch have the same biomass density than a large patch connected to a small patch?)

Plots
for (disturbance_input in c("low", "high")) {

  print(ds_biomass %>%
  filter(disturbance == disturbance_input) %>%
  filter(patch_size == "L") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = system_nr,
             fill = system_nr,
             color = system_nr,
             linetype = eco_metaeco_type)) +
  geom_line(stat = "summary", fun = "mean") + 
  labs(x = "Day",
       y = "Local bioarea (µm²/μl)",
       title = paste("Disturbance =", disturbance_input),
       linetype = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("large isolated",
                                     "large connected to large",
                                     "large connected to small")) +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation"))}

for (disturbance_input in c("low", "high")){
  print(ds_biomass %>%
  filter(disturbance == disturbance_input) %>%
  filter(patch_size == "L") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = interaction(day,eco_metaeco_type),
             fill = eco_metaeco_type)) +
  geom_boxplot() +
  labs(title = paste("Disturbance =", disturbance_input),
       x = "Day",
       y = "Local bioarea (µm²/μl)",
       fill = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("large isolated", 
                                 "large connected to large",
                                 "large connected to small")) +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation"))
  }

for (disturbance_input in c("low", "high")){
print(ds_lnRR %>%
  filter(disturbance == disturbance_input) %>%
  filter(eco_metaeco_type == "L (L_L)" | eco_metaeco_type == "L (S_L)") %>%
  ggplot(aes(x = day,
             y = lnRR_bioarea_density,
             color = eco_metaeco_type)) +
  geom_point(position = position_dodge(0.5)) +
  geom_line(position = position_dodge(0.5)) + 
  labs(title = paste("Disturbance =", disturbance_input),
       x = "Day",
       y = "lnRR local bioarea (µm²/µl)",
       color = "") +
  geom_errorbar(aes(ymin = lnRR_lower, 
                    ymax = lnRR_upper), 
                width = .2,
                position = position_dodge(0.5)) + 
  scale_color_discrete(labels = c("large connected to large", 
                                 "large connnected to small")) +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.90, .97),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
#  geom_vline(xintercept = first_perturbation_day + 0.7, 
#             linetype="dotdash", 
#             color = "grey", 
#             size=0.7) +
  geom_hline(yintercept = 0, 
             linetype="dotted", 
             color = "black", 
             size=0.7))
}

Time series
Model selection

We will exclude in the model the time point 0 and 1. At time point 0 all cultures were the same because that’s how we made them. At time point 1 no disturbance event had already taken place.

first_time_point = 2
last_time_point = 7

Let’s start from the full model (no mixed effect: meta-ecosystems have been pulled to create the lnRR):

\[ ln \: RR (bioarea \: density) = t + P + D + t*P + t*D + P*D \]

lnRR(bioarea density) = lnRR of the bioarea density (base level is calculated at each disturbance level and time point as the mean bioarea of the small isolated patches)

t = time

P = patch type

D = disturbance

full_model = lm(lnRR_bioarea_density ~
                  day + 
                  eco_metaeco_type + 
                  disturbance +
                  day * eco_metaeco_type +
                  day * disturbance + 
                  eco_metaeco_type * disturbance,
                  data = ds_lnRR %>%
                         filter(time_point >= first_time_point) %>%
                         filter(time_point <= last_time_point) %>%
                         filter(eco_metaeco_type== "L (L_L)" | 
                                eco_metaeco_type == "L (S_L)"))

Should we keep t * P?

no_TP = lm(lnRR_bioarea_density ~
                  day + 
                  eco_metaeco_type + 
                  disturbance +
                  day * disturbance + 
                  eco_metaeco_type * disturbance,
                  data = ds_lnRR %>%
                         filter(time_point >= first_time_point) %>%
                         filter(time_point <= last_time_point) %>%
                         filter(eco_metaeco_type== "L (L_L)" | 
                                eco_metaeco_type == "L (S_L)"))

AIC(full_model, no_TP)
##            df       AIC
## full_model  8 -15.52389
## no_TP       7 -17.01369

No.

Should we keep t * D?

no_TD = lm(lnRR_bioarea_density ~
                  day + 
                  eco_metaeco_type + 
                  disturbance +
                  eco_metaeco_type * disturbance,
                  data = ds_lnRR %>%
                         filter(time_point >= first_time_point) %>%
                         filter(time_point <= last_time_point) %>%
                         filter(eco_metaeco_type== "L (L_L)" | 
                                eco_metaeco_type == "L (S_L)"))

AIC(no_TP, no_TD)
##       df       AIC
## no_TP  7 -17.01369
## no_TD  6 -13.70210

Yes.

Should we keep P * D?

no_PD = lm(lnRR_bioarea_density ~
                  day + 
                  eco_metaeco_type + 
                  disturbance +
                  day * disturbance,
                  data = ds_lnRR %>%
                         filter(time_point >= first_time_point) %>%
                         filter(time_point <= last_time_point) %>%
                         filter(eco_metaeco_type== "L (L_L)" | 
                                eco_metaeco_type == "L (S_L)"))

AIC(no_TP, no_PD)
##       df       AIC
## no_TP  7 -17.01369
## no_PD  6 -17.26697

No.

Best model

Then our best model is:

\[ lnRR (bioarea \: density) = t + P + D + tP \]

Let’s then do some model diagnostics.

best_model = no_PD
par(mfrow = c(2,3))
plot(best_model, which = 1:5)

R2_full = glance(best_model)$r.squared

no_patch_type = lm(lnRR_bioarea_density ~
                  day + 
                  disturbance +
                  day * disturbance,
                  data = ds_lnRR %>%
                         filter(time_point >= first_time_point) %>%
                         filter(time_point <= last_time_point) %>%
                         filter(eco_metaeco_type== "L (L_L)" | 
                                eco_metaeco_type == "L (S_L)"))

R2_no_P = glance(no_patch_type)$r.squared
R2_P = R2_full - R2_no_P

R2_full = round(R2_full, digits = 2)
R2_P = round(R2_P, digits = 2)


The adjusted R squared of the model is 0.58 and the adjusted R squared of patch type is 0.09 (which includes also its interaction with disturbance).

Single time points

Isolated patches

How does biomass density change according to the size of isolated patches? (How does the biomass of small, medium, and large patches change?)

ds_biomass %>%
  filter ( disturbance == "low") %>%
  filter(metaecosystem == "no") %>%
  group_by (system_nr, day, patch_size) %>%
  summarise(mean_bioarea_per_volume_across_videos = mean(bioarea_per_volume)) %>%
  ggplot (aes(x = day,
                y = mean_bioarea_per_volume_across_videos,
                group = system_nr,
                fill = system_nr,
              color = system_nr,
                linetype = patch_size)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (µm²/µl)",
         title = "Disturbance = low",
         fill = "System nr",
         linetype = "") +
    scale_y_continuous(limits = c(0, 6250)) +
    scale_x_continuous(limits = c(-2, 30)) +
  scale_colour_continuous(guide = "none") +
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("large isolated",
                                     "medium isolated",
                                     "small isolated"))

ds_biomass %>%
  filter ( disturbance == "high") %>%
  filter(metaecosystem == "no") %>%
  group_by (system_nr, day, patch_size) %>%
  summarise(mean_bioarea_per_volume_across_videos = mean(bioarea_per_volume)) %>%
  ggplot (aes(x = day,
                y = mean_bioarea_per_volume_across_videos,
                group = system_nr,
                fill = system_nr,
              color = system_nr,
                linetype = patch_size)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (µm²/µl)",
         title = "Disturbance = low",
         fill = "System nr",
         linetype = "") +
    scale_y_continuous(limits = c(0, 6250)) +
    scale_x_continuous(limits = c(-2, 30)) +
  scale_colour_continuous(guide = "none") +
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("large isolated",
                                     "medium isolated",
                                     "small isolated"))

ds_biomass %>%
  filter(disturbance == "low") %>%
  filter(metaecosystem == "no") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = interaction(day, patch_size),
             fill = patch_size)) +
  geom_boxplot() + 
  labs(title = "Disturbance = low",
       x = "Day",
       y = "Local bioarea (µm²/μl)",
       fill = "") + 
  scale_fill_discrete(labels = c("isolated large", "isolated medium", "isolated small")) +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6))

ds_biomass %>%
  filter(disturbance == "high") %>%
  filter(metaecosystem == "no") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = interaction(day, patch_size),
             fill = patch_size)) +
  geom_boxplot() + 
  labs(title = "Disturbance = high",
       x = "Day",
       y = "Local bioarea (µm²/μl)",
       fill = "") + 
  scale_fill_discrete(labels = c("isolated large", "isolated medium", "isolated small")) +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6))

Evaporation

Was the amount of evaporation different across different treatments and time points?

We want to know if there was a systematic bias in the evaporation of different treatments (disturbance, patch size) and whether evaporation changed across time. My expectation would be that we would see a difference among the exchanges 2,3 and the exchanges 4,5,6. This is because in exchange 2,3 cultures were microwaved in 15 tubes for 3 minutes and in exchange 4,5,6 cultures were microwaved in 4 tubes for only 1 minute.

Tidy

#Columns: exchange & evaporation
ds_for_evaporation = gather(ds_for_evaporation, 
                            key = exchange, 
                            value = evaporation, 
                            water_add_after_t2:water_add_after_t6)
ds_for_evaporation[ds_for_evaporation == "water_add_after_t2"] = "2"
ds_for_evaporation[ds_for_evaporation == "water_add_after_t3"] = "3"
ds_for_evaporation[ds_for_evaporation == "water_add_after_t4"] = "4"
ds_for_evaporation[ds_for_evaporation == "water_add_after_t5"] = "5"
ds_for_evaporation[ds_for_evaporation == "water_add_after_t6"] = "6"
ds_for_evaporation$evaporation[ds_for_evaporation$exchange == 2] = ds_for_evaporation$evaporation[ds_for_evaporation$exchange == 2] / 2 #This is because exchange contained the topping up of two exchanges
ds_for_evaporation$evaporation[ds_for_evaporation$exchange == 2] = ds_for_evaporation$evaporation[ds_for_evaporation$exchange == 2] + 2 #We need to add 2 ml to the evaporation that happened at the exchange events 1 and 2. This is because we already added 1 ml of water at exchange 1 and 1 ml of water at exchange 2. 

#Column: nr_of_tubes_in_rack
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 1] = 15
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 2] = 15
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 3] = 15
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 4] = 4
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 5] = 4
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 6] = 4

Plot

ds_for_evaporation %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = as.character(nr_of_tubes_in_rack),
             y = evaporation)) + 
  geom_boxplot() + 
  labs(x = "Number of tubes in rack", 
       y = "Evaporation (ml)")

ds_for_evaporation %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = as.character(patch_size),
             y = evaporation)) + 
  geom_boxplot() + 
  labs(x = "Patch size", 
       y = "Evaporation (ml)")

ds_for_evaporation %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = as.character(day),
             y = evaporation)) + 
  geom_boxplot() + 
  labs(x = "Day", 
       y = "Evaporation (ml)")

ds_for_evaporation %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = disturbance,
             y = evaporation)) + 
  geom_boxplot() + 
  labs(x = "Disturbance", 
       y = "Evaporation (ml)")

It seems like there is no real difference across time, disturbance, or patch type. However, we could also run a mixed effect model to show that they do not.

Mixed effect model

It gives me the following error:

  • Error in fn(nM$xeval()) : Downdated VtV is not positive definite
mixed.model = lmer(evaporation  ~ 
                     patch_size * disturbance  * exchange + 
                     (exchange | culture_ID), 
                   data = ds_for_evaporation,
                   REML = FALSE, 
                   control = lmerControl (optimizer = "Nelder_Mead"))

null.model = lm(evaporation ~
                  1, 
                data = ds_for_evaporation)

anova(mixed.model, null.model)

Body size

Aim

Data

Experimental cultures

culture_info = read.csv(here("data", "PatchSizePilot_culture_info.csv"), header = TRUE)
load(here("data", "morphology", "t0.RData"));t0 = morph_mvt
load(here("data", "morphology", "t1.RData"));t1 = morph_mvt
load(here("data", "morphology", "t2.RData"));t2 = morph_mvt
load(here("data", "morphology", "t3.RData"));t3 = morph_mvt
load(here("data", "morphology", "t4.RData"));t4 = morph_mvt
load(here("data", "morphology", "t5.RData"));t5 = morph_mvt
load(here("data", "morphology", "t6.RData"));t6 = morph_mvt
load(here("data", "morphology", "t7.RData"));t7 = morph_mvt
rm(morph_mvt)

datatable(culture_info[,1:10],
          rownames = FALSE,
          options = list(scrollX = TRUE),
          filter = list(position = 'top', 
                        clear = FALSE))

Body size data-set

### --- Tidy t0 - t7 data-sets --- ###

#Column: time
t0$time = NA
t1$time = NA

#Column: replicate_video
t0$replicate_video[t0$file == "sample_00001"] = 1
t0$replicate_video[t0$file == "sample_00002"] = 2
t0$replicate_video[t0$file == "sample_00003"] = 3
t0$replicate_video[t0$file == "sample_00004"] = 4
t0$replicate_video[t0$file == "sample_00005"] = 5
t0$replicate_video[t0$file == "sample_00006"] = 6
t0$replicate_video[t0$file == "sample_00007"] = 7
t0$replicate_video[t0$file == "sample_00008"] = 8
t0$replicate_video[t0$file == "sample_00009"] = 9
t0$replicate_video[t0$file == "sample_00010"] = 10
t0$replicate_video[t0$file == "sample_00011"] = 11
t0$replicate_video[t0$file == "sample_00012"] = 12
t1$replicate_video = 1 #In t1 I took only 1 video/culture
t2$replicate_video = 1 #In t2 I took only 1 video/culture
t3$replicate_video = 1 #In t3 I took only 1 video/culture
t4$replicate_video = 1 #In t4 I took only 1 video/culture
t5$replicate_video = 1 #In t5 I took only 1 video/culture
t6 = t6 %>% rename(replicate_video = replicate)
t7 = t7 %>% rename(replicate_video = replicate)


### --- Create ds_body_size dataset --- ###

long_t0 = t0 %>% slice(rep(1:n(), max(culture_info$culture_ID)))
ID_vector = NULL
ID_vector_elongating = NULL
for (ID in 1:max(culture_info$culture_ID)){
  ID_vector = rep(ID, times = nrow(t0))
  ID_vector_elongating = c(ID_vector_elongating, ID_vector)
}
long_t0$culture_ID = ID_vector_elongating
t0 = merge(culture_info,long_t0, by="culture_ID"); rm(long_t0)
t1 = merge(culture_info,t1,by="culture_ID")
t2 = merge(culture_info,t2,by="culture_ID")
t3 = merge(culture_info,t3,by="culture_ID")
t4 = merge(culture_info,t4,by="culture_ID")
t5 = merge(culture_info,t5,by="culture_ID")
t6 = merge(culture_info,t6,by="culture_ID")
t7 = merge(culture_info,t7,by="culture_ID")
ds_body_size = rbind(t0, t1, t2, t3, t4, t5, t6, t7); rm(t0, t1, t2, t3, t4, t5, t6, t7)

### --- Tidy ds_body_size data-set --- ###

#Column: day
ds_body_size$day = ds_body_size$time_point;
ds_body_size$day[ds_body_size$day=="t0"] = "0"
ds_body_size$day[ds_body_size$day=="t1"] = "4"
ds_body_size$day[ds_body_size$day=="t2"] = "8"
ds_body_size$day[ds_body_size$day=="t3"] = "12"
ds_body_size$day[ds_body_size$day=="t4"] = "16"
ds_body_size$day[ds_body_size$day=="t5"] = "20"
ds_body_size$day[ds_body_size$day=="t6"] = "24"
ds_body_size$day[ds_body_size$day=="t7"] = "28"
ds_body_size$day = as.numeric(ds_body_size$day)

#Column: time point
ds_body_size$time_point[ds_body_size$time_point=="t0"] = 0
ds_body_size$time_point[ds_body_size$time_point=="t1"] = 1
ds_body_size$time_point[ds_body_size$time_point=="t2"] = 2
ds_body_size$time_point[ds_body_size$time_point=="t3"] = 3
ds_body_size$time_point[ds_body_size$time_point=="t4"] = 4
ds_body_size$time_point[ds_body_size$time_point=="t5"] = 5
ds_body_size$time_point[ds_body_size$time_point=="t6"] = 6
ds_body_size$time_point[ds_body_size$time_point=="t7"] = 7
ds_body_size$time_point = as.character(ds_body_size$time_point)

#Column: eco_metaeco_type
ds_body_size$eco_metaeco_type = factor(ds_body_size$eco_metaeco_type, 
                             levels=c('S', 'S (S_S)', 'S (S_L)', 'M', 'M (M_M)', 'L', 'L (L_L)', 'L (S_L)'))

#Select useful columns
ds_body_size = ds_body_size %>% 
  select(culture_ID, 
         patch_size, 
         disturbance, 
         metaecosystem_type, 
         mean_area, 
         replicate_video, 
         day, 
         metaecosystem, 
         system_nr, 
         eco_metaeco_type)

#Reorder columns
ds_body_size = ds_body_size[, c("culture_ID", 
            "system_nr", 
            "disturbance", 
            "day",
            "patch_size", 
            "metaecosystem", 
            "metaecosystem_type", 
            "eco_metaeco_type", 
            "replicate_video",
            "mean_area")]

datatable(ds_body_size,
          rownames = FALSE,
          options = list(scrollX = TRUE),
          filter = list(position = 'top', 
                        clear = FALSE))
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html

Size classes data-set

I am here creating 12 size classes as in Jacquet, Gounand, and Altermatt (2020). However, for some reason it seems like our body size classes are really different.

#### --- PARAMETERS & INITIALISATION --- ###

nr_of_size_classes = 12
largest_size = max(ds_body_size$mean_area)
size_class_width = largest_size/nr_of_size_classes
size_class = NULL

### --- CREATE DATASET --- ###

size_class_boundaries = seq(0, largest_size, by = size_class_width)

for (class in 1:nr_of_size_classes){
  
  bin_lower_limit = size_class_boundaries[class]
  bin_upper_limit = size_class_boundaries[class+1]
  size_input = (size_class_boundaries[class] + size_class_boundaries[class + 1])/2
  
  size_class[[class]] = ds_body_size%>%
    filter(bin_lower_limit <= mean_area) %>%
    filter(mean_area <= bin_upper_limit) %>%
    group_by(culture_ID, 
             system_nr, 
             disturbance, 
             day, 
             patch_size, 
             metaecosystem, 
             metaecosystem_type, 
             eco_metaeco_type, 
             replicate_video) %>% #Group by video
    summarise(mean_abundance_across_videos = n()) %>%
    group_by(culture_ID, 
             system_nr, 
             disturbance, 
             day, 
             patch_size, 
             metaecosystem, 
             metaecosystem_type, 
             eco_metaeco_type) %>% #Group by ID
    summarise(abundance = mean(mean_abundance_across_videos)) %>%
    mutate(log_abundance = log(abundance)) %>%
    mutate(size_class = class) %>%
    mutate(size = size_input) %>%
    mutate(log_size = log(size))
  
}

ds_classes = rbind(size_class[[1]], size_class[[2]], size_class[[3]], size_class[[4]],
                  size_class[[5]], size_class[[6]], size_class[[7]], size_class[[8]],
                  size_class[[9]], size_class[[10]], size_class[[11]], size_class[[12]],)

datatable(ds_classes,
          rownames = FALSE,
          options = list(scrollX = TRUE),
          filter = list(position = 'top', 
                        clear = FALSE))

Plot

Comparison of different ecosystem types across time

#Trying out gganimate, but I can't seem to manage to install transformr packaget
p = list()
n = 0
first_level = c("isolated small", "isolated small", "isolated large", "isolated large")
second_level = c("small connected to small", "small connected to small", "large connected to large", "large connected to large")
third_level = c("small connected to large", "small connected to large", "large connected to small", "large connected to small")
for (patch_size_input in c("S", "L")){
  
  for(disturbance_input in c("low", "high")){
  
    n = n + 1
      
  title = paste0(patch_size_input,
              ' patches, Disturbance = ',
              disturbance_input, 
              ', Day: {round(frame_time, digits = 0)}')
  
  p[[n]] <- ds_classes %>%
  filter(disturbance == disturbance_input) %>%
  filter(patch_size == patch_size_input) %>%
  ggplot(aes(x = log_size,
             y = log_abundance,
             group = interaction(log_size, eco_metaeco_type),
             color = eco_metaeco_type)) +
  geom_point(stat = "summary", fun = "mean") +
  geom_line(stat = "summary", fun = "mean", aes(group=eco_metaeco_type)) +
  scale_color_discrete(labels = c(first_level[n], 
                                 second_level[n],
                                 third_level[n])) +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  labs(title = title,
       x = 'Log size (μm2)', 
       y = 'Log abundance + 1 (indiv/μm2)',
       color = "") +
  transition_time(day) +
  ease_aes('linear')
  
  animate(p[[n]], 
        duration = 10,
        fps = 25, 
        width = 500, 
        height = 500, 
        renderer = gifski_renderer())
  
  anim_save(here("gifs", 
                 paste0("transition_day_", 
                        patch_size_input,"_", 
                        disturbance_input, 
                        ".gif")))
}
}

  • Next step: making plots slower (the package stopped working because I broke some dependencies)

Figures

Regional biomass under low disturbance in meta-ecosystems of the same total area, but exhibiting two local patches of the same size (red, medium-medium meta-ecosystem) or different size (blue, small-large meta-ecosystem). Points represent the mean, error bars represent the standard deviation. See the Appendix for equivalent figure of the high disturbance treatment.

Regional biomass under low disturbance in meta-ecosystems of the same total area, but exhibiting two local patches of the same size (red, medium-medium meta-ecosystem) or different size (blue, small-large meta-ecosystem). Points represent the mean, error bars represent the standard deviation. See the Appendix for equivalent figure of the high disturbance treatment.

Tests

Evaporation when microwaving 15 falcon tubes at the time

evaporation.test = read.csv(here("data", "evaporation_test","evaporation_test_right.csv"), header = TRUE)

evaporation.test %>%
  ggplot(aes (x = as.character(water_pipetted),
                y = weight_water_evaporated,
                group = interaction(water_pipetted, as.character(rack)),
                fill = as.character(rack))) +
  geom_boxplot() +
  labs(x = "Water volume (ml)" , 
       y = "Evaporation (g)", 
       fill = "Rack replicate")

Evaporation when microwaving 5 tubes with 10 filled or empty tubes

evaporation.test = read.csv(here("data", "evaporation_test", "evaporation_test_fill_nofill.csv"), header = TRUE)

evaporation.test %>%
  ggplot(aes (x = all_tubes_water,
              y = weight_water_evaporated)) +
  geom_boxplot() +
  labs(x = "Water in the other 10 tubes" , 
  y = "Evaporation (g)", 
  caption = "When all tubes were filled, they were filled with 6.75 ml of deionised water.")

Running time

## Time difference of 56.25339 secs

Other

Mixed effects models

  • To build the mixed effect models we will use the R package lme4. See page 6 of this PDF to know more about the syntaxis of this package and this link for the interaction syntaxis.

  • To do model diagnostics of mixed effect models, I’m going to look at the following two plots (as suggested by Zuur et al. (2009), page 487):

    • Quantile-quantile plots (plot(mixed_model))

    • Partial residual plots (qqnorm(resid(mixed_model)))

  • The effect size of the explaining variables is calculated in the mixed effect models as marginal and conditional r squared. The marginal r squared is how much variance is explained by the fixed effects. The conditional r squared is how much variance is explained by the fixed and the random effects. The marginal and conditional r squared are calculated using the package MuMIn. The computation is based on the methods of Nakagawa, Johnson, and Schielzeth (2017). For the coding and interpretation of these r squared check the documentation for the r.squaredGLMM function

  • Time can be included as a fixed or random effect. Time can be included as a random effect if the different data points are non independent from each other (e.g., seasons). However, because the biomass in our experiment was following a temporal trend, the different time points show autocorrelation. In other words, t2 is more similar to t3 than t4 and so on. This is why we decided to include time as a fixed effect. For an excellent discussion on this topic see this blog post.

Modeling choices

  • I am going to select the best model according to AIC. Halsey (2019) suggests this approach instead of p values. P-values are not a reliable way of choosing a model because:

    • My sample size is small, producing larger p-values

    • P-values are really variable, creating many false positives and negatives (e.g., if p=0.05 there is a 1 in 3 chance that it’s a false positive)

  • To study the local biomass how it changes across treatments, we could have made three different models between the three combinations of small patches. However, that might be confusing to interpret the results. We decided instead to use an effect size where we control is the isolated small patch. At the beginning we thought to use the natural logarithm of the response ratio (lnRR). The problem, however, is that some bioarea values were 0. We were thinking to add 1 to all null values, but according to Rosenberg, Rothstein, and Gurevitch (2013), such practice inflates effect sizes. Because of this, I looked into other types of effect size. I found that the most common and preferred metric in use today is known as Hedge’s d (a.k.a. Hedge’s g) (Hedges, Larry V. and Olkin (1985) ). It is calculated as the difference in mean between treatment and control divided by the standard deviation of the pooled data. Another measure would be Cohen’s d, but it underperforms with sample sizes that are lower than 20 (StatisticsHowTo). I can easily calculate the Hedge’s d using the r package effsize.
    Same thing for the large patches.

To do

  • Local biomass
    • Data

      • Make the code for the confidence intervals of lnRR of the local biomass cleaner
    • Small patches

      • Look at error structure of the model. If the local biomass is normally distributed use lm, otherwise use glm

      • Single time points

    • Large patches

      • Look at error structure

      • Single time points

  • Regional biomass
    • M_M vs S_L

      • calculate R2 for meta-ecosystem type

      • Make a table with how r2 of M changes across time points

    • S_L effects of flow

      • Think if it would be better to look at it as an effect size

      • Plot again

      • Model again

  • Body size
    • Fix the gganimate problem
  • Figures
    • Add small patches lnRR low disturbance

Bibliography

Halsey, Lewis G. 2019. The reign of the p-value is over: What alternative analyses could we employ to fill the power vacuum? Biology Letters 15 (5). https://doi.org/10.1098/rsbl.2019.0174.
Hedges, Larry V., and Ingram Olkin. 1985. Statistical Methods for Meta-Analysis.
Jacquet, Claire, Isabelle Gounand, and Florian Altermatt. 2020. How pulse disturbances shape size-abundance pyramids.” Ecology Letters 23 (6): 1014–23. https://doi.org/10.1111/ele.13508.
Nakagawa, Shinichi, Paul C. D. Johnson, and Holger Schielzeth. 2017. The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded.” Journal of the Royal Society Interface 14 (134). https://doi.org/10.1098/rsif.2017.0213.
Rosenberg, Michael S., Hannah R. Rothstein, and Jessica Gurevitch. 2013. Effect sizes: Conventional choices and calculations.” Handbook of Meta-Analysis in Ecology and Evolution, 61–71. https://doi.org/10.23943/princeton/9780691137285.003.0006.
Zuur, Alain F., Elena N. Ieno, Neil Walker, Anatoly A. Saveliev, and Graham M. Smith. 2009. Mixed effects models and extensions in ecology with R. Vol. 36. Statistics for Biology and Health. New York, NY: Springer New York. https://doi.org/10.1007/978-0-387-87458-6.